Automatic Attribution of Quoted Speech in Literary Narrative

Authors

  • David K. Elson
  • Kathleen McKeown

Abstract

We describe a method for identifying the speakers of quoted speech in natural-language textual stories. We have assembled a corpus of more than 3,000 quotations, whose speakers (if any) are manually identified, from a collection of 19th and 20th century literature by six authors. Using rule-based and statistical learning, our method identifies candidate characters, determines their genders, and attributes each quote to the most likely speaker. We divide the quotes into syntactic classes in order to leverage common discourse patterns, which enable rapid attribution for many quotes. We apply learning algorithms to the remainder and achieve an overall accuracy of 83%.
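
The idea of resolving some quotes by syntactic pattern alone can be illustrated with a minimal sketch. This is not the authors' implementation: the verb list, the single-capitalized-word name pattern, and the function name `attribute_quotes` are all assumptions made for illustration. The sketch only checks whether a quote is immediately followed by "verb Name" or "Name verb"; quotes that match neither pattern are left unattributed, which is where a learned model such as the one the abstract mentions would take over.

```python
import re

# Hypothetical speech verbs; the abstract does not list any.
SPEECH_VERBS = r"(?:said|replied|asked|cried|answered)"
# Hypothetical name pattern: a single capitalized word.
NAME = r"([A-Z][a-z]+)"

QUOTE = re.compile(r'"([^"]+)"')
AFTER_VERB_NAME = re.compile(r"\s+" + SPEECH_VERBS + r"\s+" + NAME)
AFTER_NAME_VERB = re.compile(r"\s+" + NAME + r"\s+" + SPEECH_VERBS + r"\b")


def attribute_quotes(text):
    """Return (quote, speaker) pairs; speaker is None if no pattern matches."""
    results = []
    for q in QUOTE.finditer(text):
        speaker = None
        # Try each pattern on the text that starts right after the closing quote.
        for pattern in (AFTER_VERB_NAME, AFTER_NAME_VERB):
            m = pattern.match(text, q.end())
            if m:
                speaker = m.group(1)
                break
        results.append((q.group(1), speaker))
    return results


if __name__ == "__main__":
    sample = '"Good morning," said Emma. "Is it?" Knightley replied. "Perhaps."'
    for quote, speaker in attribute_quotes(sample):
        print(repr(quote), "->", speaker)
```

A pattern pass like this is cheap and precise for the quotes it covers, but it handles neither pronouns nor alternating dialogue without explicit attribution, so it is at most a first stage.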

Similar resources

Foregrounding in the Fadak Sermon of Hazrat Zahra (AS)

Foregrounding is a contemporary literary theory that, taking a literary perspective on texts in prose or verse, seeks to explain and analyze the features and elements in the body of the discourse that rhetorically distinguish literary texts from ordinary ones. According to the Formalists, foregrounding is achieved through diminishing or increasing the rules. In other...

Identifying Speakers and Listeners of Quoted Speech in Literary Works

We present the first study that evaluates both speaker and listener identification for direct speech in literary texts. Our approach consists of two steps: identification of speakers and listeners near the quotes, and dialogue chain segmentation. Evaluation results show that this approach outperforms a rule-based approach that is state-of-the-art on a corpus of literary texts.

The Actor-Topic Model for Extracting Social Networks in Literary Narrative

We present a generative model for conversational dialogues, namely the actor-topic model (ACTM), that extends the author-topic model (Rosen-Zvi et al., 2004) to identify the actors of a given conversation in literary narratives. Thus ACTM assigns each instance of quoted speech to an appropriate character. We model dialogues in a literary text, which take place between two or more actors conversing on d...

Computational Methods for Coptic

This paper motivates and details the first implementation of a freely available part-of-speech tag set and tagger for Coptic. Coptic is the last phase of the Egyptian language family and a descendant of the hieroglyphs of ancient Egypt. Unlike classical Greek and Latin, few resources for digital and computational work have existed for ancient Egyptian language and literature until now. We evalu...

Automatic Analysis and Annotation of Literary Texts

In this work, a machine-learning-oriented perspective on computer-aided support for literary analysis is presented. A representation of narrative phenomena is proposed, and an automatic annotation model for such phenomena is trained on texts provided by a critic. As a short-term research task, we studied how the observable pieces of textual evidence affect the learning agent's capabilities, over a...

Publication date: 2010